Abstract. SWiM is a semantic wiki for collaboratively building, editing and
browsing mathematical knowledge represented in the domain-specific structural
semantic markup language OMDoc. It motivates users to contribute to
collections of mathematical knowledge by instantly sharing the benefits of
knowledge-powered services with them. SWiM is currently being used for
authoring content dictionaries, i.e. collections of uniquely identified
mathematical symbols, and is being prepared for managing a large-scale proof
formalisation effort.



1 Research Background and Application Context: 
Mathematical Knowledge Management 

A great deal of scientific work consists of collaboratively authoring documents — 
taking down first hypotheses, commenting on results of experiments, circulating 
informal drafts inside a working group, and structuring, annotating, or reor- 
ganising existing items of knowledge, finally leading to the publication of a 
well-structured article or book. Here, we particularly focus on the domain of 
mathematics and on tools that support collaborative authoring by utilising 
the knowledge contained in the documents. In recent years, several semantic 
markup languages have been developed to represent the clearly defined and
hierarchical structures of mathematics. The XML languages MathML [9],
OpenMath [11], and OMDoc [3] particularly aim at exchanging mathematical
knowledge on the web. OMDoc, which employs Content MathML or OpenMath to
represent the functional structure of mathematical formulae (as opposed to
their visual appearance) and adds support for mathematical statements (like
symbol declarations or axioms) and theories, has many applications in publishing,
education, research, and data exchange [3, chap. 26]. The main challenge is ac- 
quiring a large collection of OMDoc-formalised knowledge that can power such 
added-value services. In an open, collaborative environment, the workload can 
be distributed among many authors, but as semantic markup makes fine-grained 
structures explicit, it is tedious to author. As the community can only benefit 
from added-value services after a substantial initial investment (writing,
annotating and linking) on the author's part, we sought to motivate authors into
action by offering "elaborate [...] services for the concrete situation" they are
in [2].

2 Key Technology: Semantic Wiki and Ontologies

Our research is motivated by the assumption that in this context a semantic 
wiki comes in handy. OMDoc supports all levels of formalisation, from human- 
readable texts to fully formal representations for automated theorem proving, 
and semantic wikis have been found appropriate for collaboratively refining 
knowledge models (cf. [13]). User motivation in semantic wikis by instant grat- 
ification has been investigated in earlier works [1]. The ultimate goal of our 
work is to achieve a feedback loop where users are supported to contribute well- 
structured knowledge, which is then exploited to offer services, which in turn 
facilitate editing and motivate new contributions [5]. 

Fig. 1. RDF extraction from OMDoc markup in a wiki page



Semantic markup has deep structures: an OMDoc document can contain the- 
ories containing statements that contain formulae referring to symbols defined 
in other theories. This is uncommon for most semantic wikis, where the
structures are rather flat and one aims at small pages to prevent editing conflicts and
to facilitate search and navigation. So to adapt OMDoc's model of knowledge 
to a semantic wiki, we had to choose an appropriate granularity of wiki pages 
and arrived at one page holding one mathematical statement or one theory. To 
make knowledge from OMDoc documents usable on the semantic web, information
about the resources represented by pages and their interrelations (e.g. "a
proof for the Pythagorean theorem") is extracted to RDF. As a vocabulary
for this, we modeled OMDoc's structures explicitly in a document ontology [5] 
in OWL-DL. This ontology contains e. g. the information that both theorems 
and proofs are specialisations of a general "mathematical statement", and that 
a proof can prove a theorem (Fig. 1). Moreover, generic transitive dependency 
and containment relations have been modeled. For example, having one theory 
import another theory (and reusing symbols defined there) establishes a depen- 
dency. One theory logically contains its statements; similarly, statements can 
contain sub-statements, as in the case of a proof that consists of multiple steps. 
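
The extraction step described above can be illustrated with a minimal sketch. It is not SWiM's actual implementation: the XML fragment is a simplified, namespace-free stand-in for OMDoc, and the class and property names (`Theorem`, `Proof`, `contains`, `proves`) are assumptions chosen to mirror the relations named in the text.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified OMDoc-like fragment; real OMDoc uses XML
# namespaces and a much richer element set.
PAGE = """
<theory id="geometry">
  <theorem id="pythagoras"/>
  <proof id="pythagoras-proof" for="pythagoras"/>
</theory>
"""

# Assumed mapping from element names to ontology classes.
CLASSES = {"theory": "Theory", "theorem": "Theorem", "proof": "Proof"}


def extract_triples(xml_text):
    """Return a set of (subject, predicate, object) triples for one fragment."""
    triples = set()

    def visit(element, parent_id=None):
        element_id = element.get("id")
        if element.tag in CLASSES and element_id:
            # Type of the resource represented by this element.
            triples.add((element_id, "rdf:type", CLASSES[element.tag]))
            # Logical containment: the enclosing theory or statement.
            if parent_id:
                triples.add((parent_id, "contains", element_id))
            # A proof points at the theorem it proves.
            if element.tag == "proof" and element.get("for"):
                triples.add((element_id, "proves", element.get("for")))
            parent_id = element_id
        for child in element:
            visit(child, parent_id)

    visit(ET.fromstring(xml_text.strip()))
    return triples


for triple in sorted(extract_triples(PAGE)):
    print(triple)
```

In a setting like the one described, such triples would be written to the RDF store whenever a page is saved or a document is imported; here they are simply collected in a Python set.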



3 The SWiM 0.2 Prototype: IkeWiki + OMDoc 



As a base system for the implementation, we chose IkeWiki [12]. Among the 
systems evaluated, it offered the richest XML infrastructure (a key requirement
for adding OMDoc support) and was found to be most extensible [4]. Its backend
consists of a PostgreSQL database for the page contents and a Jena RDF store for
the RDF graph and the ontologies. Additional ontologies can easily be imported.
The frontend heavily relies on the Dojo Ajax toolkit. 

Technically, the extension of IkeWiki to SWiM required supporting OMDoc 
in addition to the HTML-like wiki page format. To foster stepwise formalisation 
of informal text, we chose to mix OMDoc fragments with wiki markup. Thus 
we could still rely on IkeWiki's WYSIWYG HTML editor, which just had to be 
enhanced by support for OMDoc XML elements. Moreover, this choice allowed 
for an easier maintenance of the OMDoc-related enhancements to the SWiM code 
base and avoided changes to the underlying database schema. The document 
ontology is preloaded into the RDF store. RDF triples are extracted from the 
OMDoc markup upon saving a page or importing an OMDoc file. Additional 
XSLT template rules take care of rendering embedded OMDoc fragments. In order
to render mathematical formulae, there is a notation definition for every semantic 
symbol. These notation definitions can be imported and edited right in the wiki, 
as parts of OMDoc documents [6]. An efficient, specialised renderer supporting 
the upcoming MathML 3 standard [10,9] applies them to the symbols in the 
formulae. In the editing view, statement- and theory-level structures of OMDoc 
are made accessible as special HTML tables, whereas mathematical formulae 
given in semantic markup are made accessible in a simplified ASCII notation 
of OpenMath. OMDoc documents are browsable via inline links manually set in 
the informal parts, via links from occurrences of symbols in formulae to the place 
of their declaration, set by the formula renderer, and via RDF links, displayed in 
a separate box by IkeWiki. The latter comprise those triples that are extracted 
from the markup (cf. Fig. 1), as well as triples inferred by a reasoner¹.

SWiM also relies on the ontology for reacting to changes to notation
definitions. When an author changes a notation definition n for a symbol s, exactly
those wiki pages that contain a formula using s or that include other pages 
containing such formulae need to be re-rendered. Looking up the symbol s
rendered by n, the formulae f_i using s, or pages (transitively) including the f_i would
be clumsy in the OMDoc XML sources, but is easy in the RDF graph, as this 
information is extracted from the documents and represented using ontology 
properties such as NotationDefinition-renders-Symbol and
Statement-contains-Formula; Formula-uses-Symbol. This service allows for instant visual debugging
of notation definitions [6]. For upcoming releases, more ontology-powered ser- 
vices are planned, including more general change management, learning assis- 
tance, and editing facilitations like editing of subsections and auto-completion of 
link targets [7]. There is some evidence that many services can be based on the 
most generic relations of dependency and (physical or logical) containment [5].
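
The lookup of pages affected by a changed notation definition can be sketched as a walk over extracted triples. This is an illustration under stated assumptions, not SWiM's code: the property names `renders`, `usesSymbol`, `containsFormula` and `includes` are hypothetical shorthands for the ontology properties mentioned above, and the triples are made up.

```python
# Hypothetical triples as they might be extracted from wiki pages.
TRIPLES = {
    ("n", "renders", "s"),
    ("page-a", "containsFormula", "f1"),
    ("f1", "usesSymbol", "s"),
    ("page-b", "includes", "page-a"),
    ("page-c", "includes", "page-b"),
    ("page-d", "containsFormula", "f2"),
    ("f2", "usesSymbol", "t"),
}


def pages_to_rerender(triples, notation):
    """Pages whose rendering depends on the given notation definition."""
    # Symbols rendered by the notation definition.
    symbols = {o for (s, p, o) in triples if s == notation and p == "renders"}
    # Formulae using one of these symbols.
    formulae = {s for (s, p, o) in triples if p == "usesSymbol" and o in symbols}
    # Pages directly containing such a formula.
    pages = {s for (s, p, o) in triples
             if p == "containsFormula" and o in formulae}
    # Follow "includes" links upwards until no new page is found.
    pending = list(pages)
    while pending:
        page = pending.pop()
        for (s, p, o) in triples:
            if p == "includes" and o == page and s not in pages:
                pages.add(s)
                pending.append(s)
    return pages


print(sorted(pages_to_rerender(TRIPLES, "n")))
```

With a reasoner that treats inclusion as transitive, the final loop could be replaced by a single query over the inferred triples; the explicit loop is used here only to keep the sketch self-contained.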



¹ The ontology is prepared for DL reasoning, but currently only the RDFS reasoner
built into Jena is used.

Fig. 2. A mathematical document in SWiM



With scientists and knowledge engineers in mind, we envisage SWiM as a
development environment that conveniently supports refactorings of knowledge².



4 Use Cases and Applications 

Now that viewing, browsing, editing, importing and exporting mathematical 
documents basically works, we are evaluating SWiM in practical settings. The 
Flyspeck project is about large-scale formalisation of a proof of the Kepler con- 
jecture. We are starting to support this effort by "crowdsourcing" the knowledge 
compiled so far (hundreds of proof sketches that are not yet machine-verifiable)
on a SWiM site [8]. The main challenge is giving interested visitors an
impression of the extent of the project and, using appropriate SPARQL queries,
showing them where work needs to be done. Currently we are investigating how
the original LaTeX sources can be utilised by automatically converting them to
HTML with MathML, then to informal OMDoc, breaking that into wiki pages,
and letting the users formalise them stepwise. For the upcoming OpenMath 3
standard, SWiM is currently being extended to an editor for OpenMath Con- 
tent Dictionaries [6], which could be regarded as flat OMDoc theories that just 
define symbols and do not import anything. There, mainly editing Dublin Core 
metadata and notation definitions is of interest. 
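
As an illustration of how queries over the RDF graph could show where work remains in the Flyspeck setting, the following sketch lists theorems for which no proof has been recorded. It uses plain Python sets instead of SPARQL so that it runs without an RDF library; the resource names and the criterion "theorem without a proof" are assumptions made for this example, not taken from the project.

```python
# Hypothetical triples; the names are invented for illustration.
TRIPLES = {
    ("lemma-1", "rdf:type", "Theorem"),
    ("lemma-2", "rdf:type", "Theorem"),
    ("proof-1", "rdf:type", "Proof"),
    ("proof-1", "proves", "lemma-1"),
}


def theorems_without_proof(triples):
    """Return the theorems that no proof resource refers to."""
    theorems = {s for (s, p, o) in triples
                if p == "rdf:type" and o == "Theorem"}
    proven = {o for (s, p, o) in triples if p == "proves"}
    return sorted(theorems - proven)


print(theorems_without_proof(TRIPLES))
```

In a SPARQL-capable store, the same question could presumably be phrased as a single query that selects theorems and filters out those that some proof points to.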

² This is common in mathematics, e.g. in algebra: If one just needs groups, they can
be defined by a theory with the four well-known axioms. For explicitly modeling
related structures as well, one would break this into smaller theories: semigroup
just defining an associative operation on a set, monoid importing this and extending
it by an identity element, and finally the refactored group, adding inverse elements.

5 Conclusion and Related Work

SWiM makes mathematical documents editable collaboratively and particularly 
facilitates browsing them by exploiting the knowledge they contain. Domain- 
specific services are powered by an ontology that models structures of documents — 
an advantage over generic semantic wikis, which would not be able to offer addi- 
tional services for mathematical knowledge. Competing non-semantic approaches 
like the math encyclopaedia PlanetMath (evaluated in [4]) are less flexible, as they 
cannot exploit the structures of their presentation-oriented LaTeX formulae and
rely on a fixed set of metadata. Most services for editing and browsing need to be 
hard-coded, which potentially restricts the scale of knowledge management tasks
the systems can be applied to. The SWiM approach of integrating a semantic 
markup language into a wiki by choosing an appropriate page granularity, mod- 
eling a document ontology, and extracting relevant facts from the markup into 
RDF has successfully been applied to OMDoc and the closely related but syn- 
tactically different OpenMath [6] and is likely to be portable to other domains 
as well, e.g. for the chemical markup language CML. 